
perf: wire apply_diff_eq into all filter paths — O(candidates) vs O(base)#205

Merged
JustMaier merged 1 commit into main from ivy/apply-diff-eq-wiring on Apr 13, 2026


@JustMaier
Contributor

Summary

  • Replace fused_cow (clones 13MB base on dirty bitmaps) with apply_diff_eq/union_with_diff across all 5 filter clause paths (Eq, In, In-complement, NotEq, NotIn)
  • These methods already exist in FilterField but were never wired into the executor hot path — bitmap scout P1 recommendation
  • Expected 2-10x improvement per dirty clause (O(candidates) instead of O(base))

Problem

Prod traces show filter evaluation taking 40-372ms on cache misses. The fused_cow path clones the entire 13MB base bitmap whenever a field is dirty (has pending diffs). At 109M records, this is the dominant P99 bottleneck after the doc-read fix (PR #204).

Changes

  • src/executor.rs: All filter paths in try_and_by_ref rewired from get_versioned + fused_cow to apply_diff_eq/union_with_diff
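A minimal sketch of why the rewiring changes the cost of an Eq clause, assuming a dirty field is a large base bitmap plus small pending diffs. Everything here is hypothetical and for illustration only: `Bitmap` is a `BTreeSet<u32>` stand-in for the real bitmap type, and `VersionedBitmap`, `fused_then_and`, and the free-standing shape of `apply_diff_eq` are not the signatures in src/executor.rs or FilterField.

```rust
use std::collections::BTreeSet;

/// Stand-in for a compressed bitmap of record IDs (assumption; not the
/// engine's real bitmap type).
type Bitmap = BTreeSet<u32>;

/// Hypothetical model of one value's bitmap in a dirty field:
/// a large immutable base plus small pending diffs.
struct VersionedBitmap {
    base: Bitmap,
    pending_sets: Bitmap,
    pending_clears: Bitmap,
}

impl VersionedBitmap {
    /// Old shape: materialize base + diffs, then intersect with the
    /// candidates. The clone of `base` costs O(base) however small `acc` is.
    fn fused_then_and(&self, acc: &Bitmap) -> Bitmap {
        let mut merged = self.base.clone();
        for id in &self.pending_clears {
            merged.remove(id);
        }
        merged.extend(self.pending_sets.iter().copied());
        acc.intersection(&merged).copied().collect()
    }

    /// New shape: visit only the candidates and consult base + overlay
    /// for each one. Work scales with `acc`, and `base` is never cloned.
    fn apply_diff_eq(&self, acc: &Bitmap) -> Bitmap {
        acc.iter().copied().filter(|id| self.contains(*id)).collect()
    }

    /// Membership after applying pending diffs (pending sets win over clears,
    /// matching the order used in `fused_then_and`).
    fn contains(&self, id: u32) -> bool {
        self.pending_sets.contains(&id)
            || (self.base.contains(&id) && !self.pending_clears.contains(&id))
    }
}
```

Both methods return the same set; the difference is only in how much of `base` they touch, which is why the gain depends on how small the candidate set is relative to the base.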

Test plan

  • Compiles clean (release build)
  • Pre-existing engine tests unaffected (Windows path issue is pre-existing)
  • Deploy to prod, compare query_filter_seconds P95/P99
  • Monitor for correctness — result counts should match (same logic, faster path)
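The "same logic, faster path" claim could also be checked offline with a differential test before relying on prod monitoring. The sketch below is an assumption-laden illustration, not the project's test suite: `reference_eq` stands in for the old clone-then-intersect path, `overlay_eq` for the candidate-driven path, and `Lcg` is a tiny hand-rolled generator used only to avoid external crates.

```rust
use std::collections::BTreeSet;

/// Stand-in bitmap type (assumption; the engine's real type differs).
type Bitmap = BTreeSet<u32>;

/// Reference path: clone base, apply diffs, intersect with candidates.
fn reference_eq(base: &Bitmap, sets: &Bitmap, clears: &Bitmap, acc: &Bitmap) -> Bitmap {
    let mut merged = base.clone();
    for id in clears {
        merged.remove(id);
    }
    merged.extend(sets.iter().copied());
    acc.intersection(&merged).copied().collect()
}

/// Candidate-driven path: test each candidate against base + overlay.
fn overlay_eq(base: &Bitmap, sets: &Bitmap, clears: &Bitmap, acc: &Bitmap) -> Bitmap {
    acc.iter()
        .copied()
        .filter(|id| sets.contains(id) || (base.contains(id) && !clears.contains(id)))
        .collect()
}

/// Deterministic linear congruential generator for reproducible inputs.
struct Lcg(u64);

impl Lcg {
    fn next_u32(&mut self) -> u32 {
        self.0 = self
            .0
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        (self.0 >> 33) as u32
    }

    /// Draws up to `len` IDs from `0..universe` (duplicates collapse).
    fn bitmap(&mut self, len: usize, universe: u32) -> Bitmap {
        (0..len).map(|_| self.next_u32() % universe).collect()
    }
}

/// Runs `rounds` random cases and counts how often the two paths disagree.
fn count_mismatches(rounds: usize, seed: u64) -> usize {
    let mut rng = Lcg(seed);
    let mut mismatches = 0;
    for _ in 0..rounds {
        let base = rng.bitmap(200, 500);
        let sets = rng.bitmap(10, 500);
        let clears = rng.bitmap(10, 500);
        let acc = rng.bitmap(50, 500);
        if reference_eq(&base, &sets, &clears, &acc) != overlay_eq(&base, &sets, &clears, &acc) {
            mismatches += 1;
        }
    }
    mismatches
}
```

A nonzero return from `count_mismatches` would indicate the two paths diverge on some input, which is a stronger signal than matching result counts alone.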

🤖 Generated with Claude Code

…ase)

Replace fused_cow (clones 13MB base bitmap on dirty fields) with
apply_diff_eq/union_with_diff across ALL five filter clause paths:

1. Eq: ff.apply_diff_eq(key, acc) — O(candidates) AND
2. In (non-complement): ff.union_with_diff(&in_keys, acc) — single lock
3. In (complement): ff.union_with_diff(&exclude_keys, acc) for subtraction
4. NotEq: ff.apply_diff_eq(key, acc) then subtract
5. NotIn: ff.union_with_diff(&keys, acc) then subtract
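The five paths above reduce to two primitives plus an optional subtraction. The following is a sketch under stated assumptions, not the executor code: `FilterField`, `KeyState`, `Clause`, `apply_clause`, `subtract`, and `bm` are hypothetical, the two method names are borrowed from this commit but their real signatures and return types are not shown here, and `BTreeSet<u32>` stands in for the real bitmap. "In (complement)" is assumed to mean an In list represented by the keys it excludes.

```rust
use std::collections::{BTreeSet, HashMap};

type Bitmap = BTreeSet<u32>;

/// Per-key state of a dirty field: base bitmap plus pending diffs.
#[derive(Default)]
struct KeyState {
    base: Bitmap,
    sets: Bitmap,
    clears: Bitmap,
}

#[derive(Default)]
struct FilterField {
    keys: HashMap<u64, KeyState>,
}

impl FilterField {
    /// Whether record `id` holds `key` once pending diffs are applied.
    fn matches(&self, key: u64, id: u32) -> bool {
        match self.keys.get(&key) {
            Some(k) => k.sets.contains(&id) || (k.base.contains(&id) && !k.clears.contains(&id)),
            None => false,
        }
    }

    /// Candidates in `acc` that hold `key`.
    fn apply_diff_eq(&self, key: u64, acc: &Bitmap) -> Bitmap {
        acc.iter().copied().filter(|&id| self.matches(key, id)).collect()
    }

    /// Candidates in `acc` that hold any of `keys`.
    fn union_with_diff(&self, keys: &[u64], acc: &Bitmap) -> Bitmap {
        acc.iter()
            .copied()
            .filter(|&id| keys.iter().any(|&k| self.matches(k, id)))
            .collect()
    }
}

enum Clause {
    Eq(u64),
    In(Vec<u64>),
    InComplement(Vec<u64>), // carries the excluded keys
    NotEq(u64),
    NotIn(Vec<u64>),
}

fn subtract(a: &Bitmap, b: &Bitmap) -> Bitmap {
    a.difference(b).copied().collect()
}

/// Narrows the candidate set `acc` by one clause without cloning any base.
fn apply_clause(ff: &FilterField, clause: &Clause, acc: &Bitmap) -> Bitmap {
    match clause {
        Clause::Eq(k) => ff.apply_diff_eq(*k, acc),
        Clause::In(ks) => ff.union_with_diff(ks, acc),
        Clause::InComplement(ex) => subtract(acc, &ff.union_with_diff(ex, acc)),
        Clause::NotEq(k) => subtract(acc, &ff.apply_diff_eq(*k, acc)),
        Clause::NotIn(ks) => subtract(acc, &ff.union_with_diff(ks, acc)),
    }
}

/// Convenience constructor for small example bitmaps.
fn bm(ids: &[u32]) -> Bitmap {
    ids.iter().copied().collect()
}
```

In this model the negated clauses reuse the positive primitives and subtract the matches from `acc`, so every path stays bounded by the candidate set rather than the base.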

Before: get_versioned + fused_cow = O(base) per dirty clause, cloning
the full base bitmap (13MB at 109M records). At 372ms filter time on
broad queries, this is the P99 bottleneck.

After: apply_diff_eq computes result at O(candidates) cost using the
diff overlay — no base clone needed. Expected 2-10x improvement per
dirty clause depending on candidate set size.

These methods already existed in FilterField but were never wired into
the executor hot path. Bitmap scout P1 recommendation.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@JustMaier JustMaier merged commit 31bbb29 into main Apr 13, 2026
1 check failed